Search results for "Algorithm engineering"
showing 3 items of 3 documents
Mapreduce in computational biology via hadoop and spark
2017
Bioinformatics has a long history of software solutions developed on multi-core computing systems for solving computational intensive problems. This option suffer from some issues solvable by shifting to Distributed Systems. In particular, the MapReduce computing paradigm, and its implementations, Hadoop and Spark, is becoming increasingly popular in the Bioinformatics field because it allows for virtual-unlimited horizontal scalability while being easy-to-use. Here we provide a qualitative evaluation of some of the most significant MapReduce bioinformatics applications. We also focus on one of these applications to show the importance of correctly engineering an application to fully exploi…
Fast and Simple Approximation of the Diameter and Radius of a Graph
2006
The increasing amount of data to be processed by computers has led to the need for highly efficient algorithms for various computational problems. Moreover, the algorithms should be as simple as possible to be practically applicable. In this paper we propose a very simple approximation algorithm for finding the diameter and the radius of an undirected graph. The algorithm runs in $O(m\sqrt{n})$ time and gives an additive error of $O(\sqrt{n})$ for a graph with n vertices and m edges. Practical experiments show that the results of our algorithm are close to the optimum and compare favorably to the 2/3-approximation algorithm for the diameter problem by Aingworth et al [1].
ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.
2015
Background: Cluster analysis is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. It is central to the life sciences due to the advent of high throughput technologies, e.g., classification of tumors. In particular, in cluster analysis, it is of relevance to assess cluster quality and to predict the number of clusters in a dataset, if any. This latter task is usually performed via internal validation measures. Despite their potentially important role, both the use of classic internal validation measures and the design of new ones, specific for microarray data, do not seem to have grea…